Abstract

The proliferation of mobile handheld devices, combined with technological advances in mobile computing, has led to a number of innovative services that make use of the location information available on such devices. Traditional yellow pages websites have now moved to mobile platforms, giving local businesses and potential nearby customers the opportunity to connect. These platforms can offer an affordable advertisement channel to local businesses. One of the mechanisms offered by location-based social networks (LBSNs) allows businesses to provide special offers to customers who connect through the platform. We collect a large time-series dataset from approximately 14 million venues on Foursquare and analyze the performance of such campaigns using randomization techniques and (non-parametric) hypothesis testing with statistical bootstrapping. Our main finding indicates that this type of promotion is not as effective as anecdotal success stories might suggest. Finally, we design classifiers by extracting three different types of features that are able to provide an educated decision on whether a special offer campaign for a local business will succeed or not, in both the short and the long term.

1 Introduction 

In recent years a number of location-based services and social media have emerged, mainly due to the rapid proliferation of mobile handheld devices in combination with technological advances in mobile computing. People can use these devices to obtain a wide range of information related to the geographic area they are currently in. Web services that have traditionally aimed at digitally connecting people with local businesses (e.g., Yelp, Urbanspoon etc.) are moving to mobile. This transformation facilitates real-time interaction between the involved parties through a two-way communication channel. For instance, Yelp users can launch a mobile application on their devices and get instant information on locales that are within their reach.

However, this mobile transformation of “yellow pages” services is beneficial to local businesses as well. Not only can they be discovered by people who are nearby, but most importantly they have an immediate way of advertising to potential customers. One of the advertisement mechanisms allows venues to use such mobile platforms to provide special offers to customers who connect with them through these services. For instance, a venue on Foursquare can offer special deals to people who check in to the locale through the application. The same is true for Yelp users, even though the actual details might differ. This can potentially be an inexpensive way for local businesses to advertise to people who are nearby and actually have the potential to visit them.

Regardless of the actual way that a special promotion is published, it serves as a channel for local venues to advertise and attract more customers, which can potentially translate to increased revenue. There are anecdotal stories of businesses that exploit such opportunities to their benefit. For example, a burger joint in Philadelphia that offered a free beer with every Foursquare check-in is such a success story (Fsq-success-stories 2011). In fact our data verify that the specific venue (denoted as $v_p$) has benefited from Foursquare special offers. Nevertheless, conclusions drawn from similar bright examples are always affected by sampling bias. Hence, the goal of our paper is to analyze and model the effectiveness of special offers through location-based social media/networks at scale. This is the first work to analyze promotions offered through LBSNs. We would like to emphasize here that our study is not focused on any specific platform (e.g., Foursquare). On the contrary, our work is focused on the generic mechanism of promotions through LBSNs, and our contribution is twofold:

(i) We analyze the effectiveness of this mechanism using a large dataset we collected that includes time-series information from approximately 14 million venues on Foursquare. Given that we do not have access to actual revenue data for venues, our evaluation metric is the number of check-ins in a venue. We examine both the period during a promotion and the period after it is completed. Our analysis combines randomization and statistical bootstrapping. In brief, we use a randomly selected set of matched reference venues that have not offered any promotion during the data collection period to build a baseline for the probability of observing an increase in a venue's check-ins. We then compare this probability with the one computed using venues that have offered a promotion. We further use block bootstrapping for non-parametric hypothesis testing to identify the venues for which the change observed in their check-ins is statistically significant. Consequently, we obtain a more robust estimate of the aforementioned probabilities. Our main result indicates that the positive effects of special offers through LBSNs are more limited than what anecdotal success stories might suggest. In particular, the probability of an increase in the mean daily check-ins for a venue that offers a promotion is approximately equal to that of the matched reference venues that do not offer any promotion. Moreover, the standardized effect size on the daily check-ins for the venues with promotions is not very different from that of the reference venues.

(ii) We investigate whether there are specific factors that can drive success for a promotion. In particular, we build classifiers, by identifying relevant features, that can provide an educated decision on whether a specific venue will enjoy positive benefits through a special offer campaign. The extracted features belong to three broad categories, that is, venue-related, promotion-related and geographical features. Our experiments indicate that we can achieve good classification performance. For instance, using simple models such as logistic regression we can achieve 83% accuracy with 0.88 AUC. Interestingly, as we will elaborate on later, our model evaluations reinforce the findings of our statistical analysis, since the promotion-related features improve the classification performance only marginally.

2 Our Dataset 

We used Foursquare's public venue API during the period 10/22/2012-5/22/2013 and queried information for 14,011,045 venues once every day. Each reading has the following tuple format: <ID, time, # check-ins, # users, # specials, # tips, # likes, tip information, special information>. During the data collection period, there are 206,163 venues in total that have published at least one special offer. Approximately 45% of these venues publish only one special. Furthermore, there are in total 735,034 unique special deals, with 88.68% of them being provided by venues in the US.



Figure 1: “Frequency” and “Flash” specials are usually shorter than other types of specials, while the “Mayor” special often lasts for a longer time.

At the time, Foursquare had 7 types of specials, namely, “Newbie”, “Flash”, “Frequency”, “Friends”, “Mayor”, “Loyalty” and “Swarm”, each requiring different conditions to be earned (Fsq-special-types 2011). “Frequency” is the most popular one in our dataset, covering approximately 86.5% of all the offers we collected, possibly because it appears to be the easiest type to unlock.

Another parameter of interest for the special offers is their duration. Figure 1 presents the empirical CDF of the offer duration. As we can see, “Frequency” and “Flash” special offers are usually active for a short duration, while “Friends” and “Swarm” usually last for a longer time, possibly due to their stricter requirements. The “Mayor” special often lasts even longer, since a customer needs to become the Foursquare mayor of the venue to unlock the deal. The mayorship is only awarded to the user who has the most check-ins in the venue during the last two months.*

As alluded to above, a venue might offer multiple specials during the 7-month data collection period. These multiple specials can be fully overlapped (i.e., they start and end at the same time), partially overlapped, or sequential. We further define a promotion period of a venue to be a continuous time period during which the venue provides at least one offer and which does not include more than two consecutive days without a special offer. In our dataset, approximately half of the promotions last for more than a week. While a promotion as defined above can include multiple individual offers, for simplicity we will use the terms promotion, offer, campaign and deal interchangeably in the rest of the paper.

Finally, Foursquare associates each venue $v$ with a category/type $T(v)$ (e.g., cafe, school etc.). This classification is hierarchical. At the top level of the hierarchy there are 9 categories: Nightlife Spots, Food, Shops & Services, Arts & Entertainment, College & University, Outdoors & Recreation, Travel & Transport, Residences and Professional & Other Places. Of these types, “Food”, “Nightlife Spots” and “Shops & Services” have the highest chances of offering a special deal (0.025, 0.04 and 0.016 respectively). This can be attributed to the fact that the majority of the venues in these categories are commercial and hence, advertisement is most probably among their priorities.

3 Effectiveness of Special Offers 

Evaluation metric: Our data are in a time-series format and we also know the start ($t_s$) and the end ($t_e$) times of the promotion period. Using these points we split each time-series into three parts that span the following periods: (i) before the special campaign, $[t_0, t_{s-1}]$, (ii) during the special campaign, $[t_s, t_e]$, and (iii) after the special campaign, $[t_{e+1}, t_n]$. The key idea is to examine and analyze the changes that occur in the daily check-ins across these three time periods.

Data processing: Let us denote the original time-series collected for the check-ins in venue $v$ with $c^{a}_{v}[t]$. Simply put, $c^{a}_{v}[t]$ is the accumulated number of check-ins in $v$ at time $t$. As mentioned above, we obtain one reading every day for every venue. However, consecutive readings might not be exactly equally spaced in time for a variety of reasons (e.g., network delays, temporary API inaccessibility etc.). Hence, we transform each time series to the intended reference time-points using interpolation. For the rest of the paper $c^{a}_{v}[t]$ will represent the interpolated time-series for the total number of check-ins in $v$, with consecutive time-points spaced 24 hours apart.

*The newest version of Foursquare no longer includes the notion of mayor.

We focus on campaign periods of venues in the US that last for at least 7 days and for which we have enough points in the time-series before the special offer (i.e., at least 4 weeks). This allows us to build a representative baseline for the venue popularity prior to the promotion. The above filters provide us with a final dataset of 40,071 promotion periods, offered by 36,567 venues, that we use in our analysis. We refer to this dataset as the promotion dataset. Note here that only a subset of those can be used for studying the long-term effect of the promotion. In particular, for 26,355 of them we have enough points in the time-series after the special offer, and we use them for the long-term effect study.

Since our metric of interest is the daily check-ins, we utilize the first-order difference of the aggregated time series:

$$c_v[t] = c^{a}_{v}[t] - c^{a}_{v}[t-1] \qquad (1)$$
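The two preprocessing steps above (re-gridding the cumulative readings and taking the first-order difference of Equation 1) can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name `daily_checkins` is hypothetical, reading times are assumed to be given in hours, and linear interpolation is an assumption, since the text only states that interpolation is used.

```python
import numpy as np

def daily_checkins(read_times_h, cumulative, step_h=24.0):
    """Re-grid cumulative check-in readings and difference them.

    read_times_h: times of the raw readings, in hours (assumed unit).
    cumulative:   accumulated check-ins observed at those times.
    Returns the even grid, the interpolated cumulative series and
    the daily check-ins (first-order difference, Equation 1).
    """
    read_times_h = np.asarray(read_times_h, dtype=float)
    cumulative = np.asarray(cumulative, dtype=float)
    # Reference time-points spaced exactly step_h hours apart.
    grid = np.arange(read_times_h[0], read_times_h[-1] + 1e-9, step_h)
    # Linear interpolation is an assumption made for this sketch.
    interpolated = np.interp(grid, read_times_h, cumulative)
    # c_v[t] = c^a_v[t] - c^a_v[t-1]
    return grid, interpolated, np.diff(interpolated)

# Readings that arrive slightly off the 24-hour schedule.
grid, cum, daily = daily_checkins([0, 25, 47, 72], [100, 125, 147, 172])
```

Differencing after interpolation keeps every daily value tied to the same 24-hour window, which is what makes the means of the different periods comparable.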

The time-series we collected might exhibit biases that affect our analysis. For instance, a change in a venue's daily check-ins might simply be a result of a change in the popularity of the social media application. Moreover, seasonality effects can distort the contribution of the campaign to $c_v[t]$. To account for such potential sources of bias in our analysis, we use a randomly selected, matched, reference group of venues that is subject to similar externalities.

3.1 Promotion dataset analysis 

We begin by examining the fraction of promotions that enjoy an increase in the mean number of check-ins per day. Let us denote the mean check-ins per day in venue $v$ before the promotion (i.e., during the period $[t_{s-k}, t_{s-1}]$) with $m^{b}_{v}$. We similarly define the average check-ins per day in $v$ during (i.e., in the time period $[t_s, t_e]$) and after (i.e., in the time period $[t_{e+1}, t_{e+w}]$) the promotion campaign as $m^{d}_{v}$ and $m^{a}_{v}$ respectively. To reiterate, in order to build a concrete baseline for the period prior to the promotion we set $k = 28$ days. In order to study the long-term effect of the promotion we would like to have a stabilized time interval after the campaign is over. Hence, we include in our analysis only the venues for which we have data for at least 7 days after the end of the promotion. Consequently, we set $w = k$ if we have 28 days of data after the promotion. Otherwise we set $w$ equal to the number of time-points available (i.e., $7 \le w < 28$).

Given this setting we first compute the difference $m^{d}_{v} - m^{b}_{v}$ ($m^{a}_{v} - m^{b}_{v}$). A positive sign essentially translates to an increase in the average daily check-ins during (after) the promotion period. Figure 2 depicts our results. As we can see, the fraction of venues in the promotion group that enjoy an increase in their check-ins during the promotion is approximately 50%, while a smaller fraction (about 35%) exhibits an increase after the offer ends. There is also some variation observed based on the venue type, with some categories exhibiting a larger fraction of venues with an increase (e.g., nightlife). However, part of this variability might be attributed to the fact that for some categories we have a very small sample in the promotion set (e.g., we only have 128 promotions in Outdoors and 30 in Residence).
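The period means defined above can be computed with a few array slices. The following is a hedged sketch: the function name `period_means` and the convention that `ts` and `te` are integer indices into the daily series are assumptions made for illustration.

```python
import numpy as np

def period_means(daily, ts, te, k=28, w_max=28, w_min=7):
    """Mean daily check-ins before, during and after a promotion.

    daily: daily check-ins c_v[t], indexed by day.
    ts, te: first and last day (inclusive) of the promotion period.
    Returns (m_b, m_d, m_a); m_a is None when fewer than w_min days
    of data exist after the promotion, so the venue is excluded
    from the long-term analysis.
    """
    daily = np.asarray(daily, dtype=float)
    if ts < k:
        raise ValueError("need at least k days of history before the promotion")
    before = daily[ts - k:ts]            # [t_{s-k}, t_{s-1}]
    during = daily[ts:te + 1]            # [t_s, t_e]
    after = daily[te + 1:te + 1 + w_max] # [t_{e+1}, t_{e+w}], w <= 28
    m_a = float(after.mean()) if len(after) >= w_min else None
    return float(before.mean()), float(during.mean()), m_a

series = [2] * 28 + [4] * 10 + [3] * 7
m_b, m_d, m_a = period_means(series, ts=28, te=37)
increase_during = (m_d - m_b) > 0
```

The sign of `m_d - m_b` (or `m_a - m_b`) is the quantity whose fraction of positive values is reported per venue category.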

In summary, a large fraction of venues exhibit an increase in their check-ins during and after the special offer. However, an almost equal proportion of venues do not enjoy an increase in the average daily check-ins. Next we delve further into the details of the effectiveness of local promotions.

3.2 Reference venues 

Our results above clearly cannot establish any causal relation between promotion campaigns and observed changes in the daily check-ins. This would require careful design of field experiments, which is not possible in our work since we only have access to observational data. The direct comparison between venues that offer promotions and those that do not can be affected by a self-selection bias of the promotion venues: venue owners might not decide randomly whether to offer a deal, and other confounding factors might affect this decision.

Therefore, in order to account for these confounding factors and other externalities, we opt to get a baseline for comparison by utilizing techniques for quasi-experimental studies. In particular, we randomly sample a reference group from the set of venues with no promotion, such that the distribution of specific observed features of this sample matches that of the promotion group. This of course assumes that there is no selection bias based on unobserved characteristics. The features we use for matching are the location as well as the type of the venue. The reference group also ensures that on average the venues in both groups will experience similar externalities (e.g., seasonal effects, effects related to the popularity of Foursquare etc.). Once the reference group is obtained, we sample the empirical promotion period distribution of the promotion venues and assign pseudo-promotion periods to the reference group venues. Consequently we perform the same analysis described in the previous section on the reference group.
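One way to read the matching procedure above is as stratified random sampling, sketched below. Everything here is an assumption made for illustration: the function names, the tuple layout, and in particular the granularity of "location" (a coarse label such as a city), which the text does not specify.

```python
import random
from collections import Counter, defaultdict

def sample_reference_group(promo, candidates, rng):
    """Draw non-promotion venues matching the promotion group.

    promo, candidates: lists of (venue_id, location, venue_type).
    For each (location, type) stratum, sample as many candidates as
    there are promotion venues in it (or all available, if fewer).
    """
    need = Counter((loc, typ) for _, loc, typ in promo)
    pool = defaultdict(list)
    for vid, loc, typ in candidates:
        pool[(loc, typ)].append(vid)
    reference = []
    for stratum, n in need.items():
        available = pool[stratum]
        reference.extend(rng.sample(available, min(n, len(available))))
    return reference

def assign_pseudo_periods(reference, promo_periods, rng):
    """Give each reference venue a period drawn from the empirical
    distribution of real promotion periods (start, end)."""
    return {vid: rng.choice(promo_periods) for vid in reference}

rng = random.Random(0)
promo = [(1, "NYC", "Food"), (2, "NYC", "Food"), (3, "PIT", "Shops")]
cands = [(10 + i, "NYC", "Food") for i in range(5)] + [(20 + i, "PIT", "Shops") for i in range(3)]
ref = sample_reference_group(promo, cands, rng)
periods = assign_pseudo_periods(ref, [(30, 40), (35, 60)], rng)
```

Repeating the draw with disjoint candidate pools would give several non-overlapping reference groups, from which confidence intervals can be formed.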

Our results from 20 non-overlapping reference groups are also depicted in Figure 2, where the 95% confidence intervals are also presented. As we can see, the fraction of venues enjoying an increase in the promotion group is higher compared to that in the reference group. If we denote with $I_d$ ($I_a$) the event of an increase in $m^{d}_{v}$ ($m^{a}_{v}$), with $S$ the event of a venue offering a special deal and with $E$ the various environmental externalities that are present, the reference group aims to obtain an estimate of the probability $P(I_d \mid E)$. On the other hand, the promotion group includes an additional externality, the presence of a promotion. Hence, with the promotion group we are able to estimate $P(I_d \mid S, E)$. Our results indicate that $P(I_d \mid S, E) > P(I_d \mid E)$ and $P(I_a \mid S, E) > P(I_a \mid E)$ when considering all types of venues. However, the difference between these two probabilities is only about 0.1 for both the short and long term.

Another aspect related to the potential effectiveness of the promotion campaign is the actual effect size of the observed change. The degree of this change can be captured through the standardized effect size of Cohen's $d$:

$$d = \frac{m^{d}_{v} - m^{b}_{v}}{s_{pooled}} \qquad (2)$$

where $s_{pooled}$ is the pooled standard deviation of the two samples (before and during the promotion). Figure 3 presents the empirical CDF for the observed standardized effect sizes in both the promotion and the reference groups for the short term. The results for the long term are similar and omitted due to space limitations. For the reference groups we also present the 95% confidence intervals of the distributions. As we can observe, there is a shift in the distribution for the promotion group, which is different for different categories. However, this shift is very small. Furthermore, an interesting point to observe is the jump in the reference groups' ECDF at $d = 0$. This means that there is a non-negligible fraction of venues in the reference group that have exactly the same mean for the two periods compared. We come back to this observation in the following section.

Figure 2: Fraction of venues exhibiting an increase in the mean daily check-ins. Panels: (a) Short-term, (b) Long-term.

Figure 3: Both the promotion and reference groups enjoy similar effect sizes.
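As an illustration of the standardized effect size, the sketch below computes Cohen's $d$ for the before and during samples. The text does not spell out which pooled standard deviation formula is used; the usual sample-size-weighted form with unbiased variances is assumed here, and the function name `cohens_d` is hypothetical.

```python
import numpy as np

def cohens_d(before, during):
    """Cohen's d for the change in mean daily check-ins.

    Uses the sample-size-weighted pooled standard deviation
    (an assumption; the paper only says "pooled").
    """
    before = np.asarray(before, dtype=float)
    during = np.asarray(during, dtype=float)
    n_b, n_d = len(before), len(during)
    pooled_var = ((n_b - 1) * before.var(ddof=1) +
                  (n_d - 1) * during.var(ddof=1)) / (n_b + n_d - 2)
    diff = during.mean() - before.mean()
    if pooled_var == 0:
        # Constant series: d is 0 when the means agree (e.g. a venue
        # with no check-ins at all), otherwise unbounded.
        return 0.0 if diff == 0 else float(np.sign(diff) * np.inf)
    return float(diff / np.sqrt(pooled_var))

d = cohens_d([1, 2, 3], [2, 3, 4])
```

A venue whose check-ins are identical in both periods yields $d = 0$ exactly, which is the kind of case that produces a jump in an empirical CDF at zero.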

3.3 Bootstrap tests 

Our results above indicate that a large number of venues exhibit small effect sizes, which might not represent robust observations. Therefore, in this section we opt to identify and analyze the promotions in our dataset that are associated with a statistically significant change in their check-ins.

Given our setting, the following two-sided hypothesis test examines whether there is a statistically significant change observed in the short term:

$$H_0 : m^{d}_{v} - m^{b}_{v} = 0 \qquad (3)$$

$$H_1 : m^{d}_{v} - m^{b}_{v} \neq 0 \qquad (4)$$

If the p-value of the test is less than $\alpha$, then there is strong statistical evidence that we can reject the null hypothesis (at the significance level of $\alpha$). The sign of the observed difference will further inform us if the change is positive. In our analysis we pick the typical value of $\alpha = 0.05$. If we want to examine the long-term effectiveness of special deals we devise the same test as in Equations 3 and 4, where we substitute $m^{d}_{v}$ with $m^{a}_{v}$. We choose to rely on bootstrap for the hypothesis testing rather than on the t-test to avoid any assumption about the distribution of the check-ins. Bootstrap also allows us to estimate the statistical power $\pi$ of the performed test. This is important since an underpowered test might be unable to detect statistically significant changes, especially if the effect size and/or the sample size are small. Consequently, this can lead to underestimation of the cases where the alternative hypothesis is true.

Statistical bootstrap (Efron and Tibshirani 1993) is a robust method for estimating the unknown distribution of a population's statistic when a sample of the population is known. The basic idea of the bootstrapping method is that in the absence of any other information about the population, the observed sample contains all the available information about the underlying distribution. Thus, resampling with replacement is the best guide to what can be expected from the population distribution had the latter been available. Generating a large number of such resamples allows us to get a very accurate estimate of the required distribution. Furthermore, for time-series data, block resampling retains any dependencies between consecutive data points (Künsch 1989).

In our study we use block bootstrapping with a block size of 2 to perform the hypothesis tests. When performing a statistical test we are interested in examining whether, under the null hypothesis, the observed value of the statistic of interest was highly unlikely to have been observed by chance. In our setting, under $H_0$ the two populations have the same mean, i.e., $m^{d}_{v} - m^{b}_{v} = 0$. Hence, we first center both samples, before and during the special, to a common mean (e.g., zero, by subtracting from each sample its own mean) in order to make the null hypothesis true. Then we bootstrap each of these samples and calculate the difference between the means of the new bootstrapped samples. By performing $B = 4999$ bootstraps, we are able to build the distribution of the difference $m^{d}_{v} - m^{b}_{v}$ under $H_0$. If the $(1 - \alpha)$ confidence interval of $m^{d}_{v} - m^{b}_{v}$ under the null hypothesis does not include the observed value from the data, then we can reject $H_0$. An empirical p-value can also be calculated by computing the fraction of bootstrap samples that led to an absolute difference greater than the one observed in the data.
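The test described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the code used in the study: it uses a moving-block scheme with overlapping blocks and random start positions, counts ties as at least as extreme when computing the p-value (so an all-zero series yields $p = 1$ rather than $p = 0$), and all function names are hypothetical.

```python
import numpy as np

def block_resample(x, block, rng):
    """One moving-block bootstrap resample of x (same length as x)."""
    n = len(x)
    n_blocks = -(-n // block)  # ceil(n / block)
    starts = rng.integers(0, n - block + 1, size=n_blocks)
    idx = (starts[:, None] + np.arange(block)).ravel()[:n]
    return x[idx]

def bootstrap_test(before, during, alpha=0.05, B=4999, block=2, seed=0):
    """Two-sided bootstrap test of H0: mean(during) - mean(before) = 0."""
    rng = np.random.default_rng(seed)
    b = np.asarray(before, dtype=float)
    d = np.asarray(during, dtype=float)
    observed = d.mean() - b.mean()
    # Center each sample on zero so that H0 holds by construction.
    b0, d0 = b - b.mean(), d - d.mean()
    null = np.array([
        block_resample(d0, block, rng).mean() - block_resample(b0, block, rng).mean()
        for _ in range(B)
    ])
    # Empirical p-value; ties count as at least as extreme.
    p_value = float(np.mean(np.abs(null) >= abs(observed)))
    lo, hi = np.quantile(null, [alpha / 2, 1 - alpha / 2])
    reject = bool(observed < lo or observed > hi)
    return float(observed), p_value, reject

before = np.tile([4, 5, 6, 5], 7)          # 28 days before the promotion
during = np.tile([19, 20, 21, 20, 20], 2)  # 10 days of promotion
observed, p_value, reject = bootstrap_test(before, during, B=499)
```

With a block size of 2, each resample keeps pairs of consecutive days together, which preserves short-range dependence in the daily check-ins.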

With statistical bootstrapping we can further estimate the power $\pi$ of the statistical test performed. $\pi$ is the conditional probability of rejecting the null hypothesis given that the alternative hypothesis is true. For calculating $\pi$ we start by following exactly the same process as above, but without centering the samples to a common mean. This will allow us to build the distribution of $m^{d}_{v} - m^{b}_{v}$ under $H_1$. Then the power of the test is the overlap between the critical region and the area below the distribution curve under $H_1$.
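A hedged sketch of the power estimate follows. It assumes the critical region is everything outside the $(1 - \alpha)$ interval of the centered (null) bootstrap distribution, and estimates $\pi$ as the fraction of uncentered (alternative) bootstrap differences that fall in that region. The helper names are hypothetical and the moving-block scheme is the same assumption as before.

```python
import numpy as np

def _block_means(x, B, block, rng):
    """Means of B moving-block bootstrap resamples of x."""
    n = len(x)
    n_blocks = -(-n // block)  # ceil(n / block)
    starts = rng.integers(0, n - block + 1, size=(B, n_blocks))
    idx = (starts[..., None] + np.arange(block)).reshape(B, -1)[:, :n]
    return x[idx].mean(axis=1)

def bootstrap_power(before, during, alpha=0.05, B=4999, block=2, seed=0):
    """Estimate the power of the two-sided bootstrap test."""
    rng = np.random.default_rng(seed)
    b = np.asarray(before, dtype=float)
    d = np.asarray(during, dtype=float)
    # Distribution of the difference under H0 (centered samples).
    null = (_block_means(d - d.mean(), B, block, rng)
            - _block_means(b - b.mean(), B, block, rng))
    # Distribution of the difference under H1 (samples as observed).
    alt = _block_means(d, B, block, rng) - _block_means(b, B, block, rng)
    lo, hi = np.quantile(null, [alpha / 2, 1 - alpha / 2])
    # Share of the H1 distribution lying inside the critical region.
    return float(np.mean((alt < lo) | (alt > hi)))

before = np.tile([4, 5, 6, 5], 7)
during = np.tile([19, 20, 21, 20, 20], 2)
power = bootstrap_power(before, during, B=999)
```

When the two periods have the same mean, the alternative distribution coincides with the null one and the estimate falls to roughly $\alpha$, which is how an underpowered test shows up.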

We have applied the bootstrap hypothesis test on our promotion and reference groups. Figure 4 presents our results for all types of venues. Similar behavior is observed for specific venue categories; due to space limitations, those results are omitted. In particular, we calculate the fraction of promotions associated with a statistically significant increase in the average daily check-ins. Note that we consider only the promotions whose p-value is less than $\alpha = 0.05$ or for which $\pi > 0.8$ (the latter is a typical value and increases our confidence that a failure to reject $H_0$ was not due to an underpowered test). As we can see, in this case the fraction of venues that exhibit an increase in the average daily check-ins is the same for both groups, i.e., $P(I_d \mid S, E) \approx P(I_d \mid E)$. This suggests that the presence of a local promotion and the increase in the average check-ins are conditionally independent given the externalities $E$. For the long term we see a smaller fraction of promotion venues enjoying a positive change in their check-ins. While the reasons for this are not clear, recent literature has reported similar findings in a tangential context. Byers et al. (Byers, Mitzenmacher, and Zervas 2012) found that venues offering Groupon deals see a reduction in their Yelp ratings after the promotion. Along the same lines, Foursquare venues that offer promotions appear more likely to see a reduction in their daily check-ins. Unfortunately, more than half of the venues in our datasets are not rated and hence, we cannot directly examine the effect of promotions on the rating.

Figure 4: When considering venues with robust changes in their check-ins, the effect of local promotions disappears. Panels: Short-term, Long-term.

Figure 5: Small effect sizes do not provide robust observations based on our bootstrap tests. Panels: (a) Short-term, (b) Long-term; x-axis: standardized effect size.

More importantly though, in the previous section we emphasized that the reference group includes a large proportion of venues with $d = 0$. This clearly reduces the fraction of venues in the reference groups that have $d > 0$, leading to smaller bars for the reference group in Figure 2. A further examination of these cases shows that the vast majority of these venues exhibit 0 check-ins over the whole period. These entries do not represent real venues; they correspond to events such as extreme weather phenomena or traffic congestion, to potentially spam venues, etc. Hence, we can remove these venues from our reference groups. After doing so we are able to recover the results presented in Figure 4, further supporting the conditional independence between an increase in the mean number of check-ins per day and promotions. Note that our bootstrap tests for these venues are extremely underpowered (practically there is no distribution, since every observation is 0) and hence, they are not included in the results presented in Figure 4. As we can further see from the plateau around $d = 0$ in Figure 5, which depicts the empirical CDF of Cohen's $d$ for the venues used in Figure 4, small effect sizes do not constitute robust observations. Of course this can either be due to the low power of the test to detect a small effect size, or due to the actual non-existence of any effect.
Figure 6: Our data support anecdotal success stories for $v_p$.


3.4 Anecdote evaluation 

As mentioned in the introduction, there are various anecdotal stories supporting the effectiveness of promotions through LBSNs, such as the one for $v_p$. In this part of our study we want to examine what our data imply for this specific venue and to verify whether our data and analysis are able to recover known ground truth. $v_p$ publishes a special deal on the 37th day of the data collection, which lasts until the end of the collection period. Therefore, we can only examine the short-term effectiveness. The standardized effect size observed is approximately 0.52, while our bootstrap test indicates that this increase is statistically significant. This is in complete agreement with reports about the specific venue (Fsq-success-stories 2011). Figure 6 further presents the bootstrap distribution of $m^{d}_{v_p} - m^{b}_{v_p}$ under $H_0$ and $H_1$.

4 Models for Local Promotions 

In this section we want to examine whether there are specific attributes that contribute to the success of a promotion. For this we build models that can provide an educated decision on whether a special deal will “succeed” or not. We consider the short and the long term separately, treating them as two different binary classification problems. Based on the bootstrap tests, the positive class includes the offers that exhibit a statistically significant increase in $m^{d}_{v}$ ($m^{a}_{v}$), while the negative class includes the special deals with a statistically significant decrease or a failure to reject the null hypothesis with a powerful test ($\pi > 0.8$). We begin by extracting three different types of features. Note that some of these features are specific to promotions, while others aim to capture other factors that can affect the popularity of a venue in general (e.g., neighborhood urban form etc.). We then evaluate the predictive power of each individual feature using a simple unsupervised learning classifier. We further build a supervised learning classifier to predict the effect of a special deal using the extracted features.

4.1 Feature extraction 

Venue-based features ($\mathcal{F}_v$): The set $\mathcal{F}_v$ includes features related to the properties of the venue publishing the special deal. The intuition behind extracting this type of feature lies in the fact that the effectiveness of the special offer can be connected to the characteristics of the venue itself. For instance, a special deal might not help a really unpopular venue at all, but it might be a great boost for a venue with medium levels of popularity.

Venue type: This is the top-level type $T(v)$ of venue $v$. Table 1 depicts the fraction of special deals offered by different types of venues that are associated with a statistically significant increase in the daily number of check-ins $c_v$, i.e., the conditional probability $P(I \mid T(v))$.

Popularity: For the venue popularity we use two separate features: (i) the mean number of check-ins per day at the venue for the period before the special offer starts, $m^{b}_{v}$, and (ii) the cumulative number of check-ins in $v$ just before the beginning of the special offer, $c^{a}_{v}[t_{s-1}]$.

Loyalty: We define the loyalty $\lambda_v$ of users in venue $v$ as:

$$\lambda_v = \frac{c^{a}_{v}[t_{s-1}]}{p^{a}_{v}[t_{s-1}]} \qquad (5)$$

where $p^{a}_{v}[t_{s-1}]$ is the accumulated number of unique users that have checked in to venue $v$ at time $t_{s-1}$. At a high level $\lambda_v$ indicates the average return (check-in) rate of users in $v$.

Likes: Foursquare allows users to like or dislike a venue. We will use the accumulated number of likes a venue has received (at time $t_{s-1}$) as a feature for our classifiers.

Tips: Foursquare allows users to leave short reviews for the venues. We use the total number of such reviews (tips in Foursquare's terminology) for venue $v$ up to time $t_{s-1}$ as a feature for our classifiers.

Promotion-based features ($\mathcal{F}_p$): The set $\mathcal{F}_p$ includes features related to the details of the special offer(s) that exist during the promotion period. The details of the deal(s) might be important for whether the promotion will succeed or not. For instance, a short-lived offer might have no impact because people did not have a chance to learn about it.

Duration: The duration D is the promotion period length. 
Intuitively, a longer duration allows users to learn and 
“spread the word” about the promotion, which consequently 
will attract more customers to check-in to the venue. 

Type: There are 7 types of special deals that can be offered by a Foursquare venue during the promotion period. Each type provides a different kind of benefit but also has different unlocking constraints. Table 2 shows the probability distribution of the positive class conditioned on the different types of special offers that are part of the promotion.

If a venue publishes two (or more) different types of deals we refer to this as a “Multi-type” offer. In order to be able to easily distinguish between different combinations of offers in these “Multi-type” deals, we encode this categorical feature as a binary vector in $\{0,1\}^7$, where each element represents a special type. “Multi-type” promotions will have multiple non-zero elements.
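The binary encoding of the offer type can be illustrated with a short sketch. The ordering of the seven elements and the function name `encode_types` are assumptions made for this example; the text only fixes the set of types and the vector length.

```python
# Element order is an assumption made for this sketch.
SPECIAL_TYPES = ("Newbie", "Flash", "Frequency", "Friends",
                 "Mayor", "Loyalty", "Swarm")

def encode_types(offer_types):
    """Encode the special types present in a promotion as a
    binary vector in {0,1}^7 (one element per special type)."""
    present = set(offer_types)
    unknown = present - set(SPECIAL_TYPES)
    if unknown:
        raise ValueError(f"unknown special types: {sorted(unknown)}")
    return [1 if t in present else 0 for t in SPECIAL_TYPES]

single = encode_types(["Frequency"])
multi = encode_types(["Flash", "Mayor", "Flash"])
is_multi_type = sum(multi) > 1
```

Because the vector only records presence, two "Flash" deals in the same promotion look identical to one, so repeated deals of the same type have to be captured by a separate feature.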

Count: Count $N_s$ is the average number of special deals per day associated with a promotion period. $N_s$ captures how frequently a venue published specials during a specific promotion period. Note that the type encoding is a binary vector and hence, if a venue is offering two deals of the same type, this can only be captured through $N_s$.

Table 1: Probability for the positive class conditioned on the type of the venue.

| Category | % Positive class (short-term) | % Positive class (long-term) |
|---|---|---|
| Nightlife | 62.07% | 50.00% |
| Food | 57.74% | 41.51% |
| Shops | 42.90% | 28.22% |
| Arts | 52.87% | 43.75% |
| College | 56.25% | 37.04% |
| Outdoors | 58.33% | 25.00% |
| Travel | 66.84% | 53.80% |
| Residence | 54.54% | 14.29% |
| Professional | 61.86% | 39.68% |

Table 2: Probability distribution of the positive class conditioned on the different types of special offers.

Type        % Positive class
            short-term  long-term
Newbie      62.24%      59.32%
Flash       60.00%      62.50%
Frequency   45.56%      30.07%
Friends     84.62%      43.75%
Mayor       67.74%      54.84%
Loyalty     50.50%      50.00%
Swarm       57.14%      0.00%
Multi-type  60.60%      44.23%


Geographical features ($\mathcal{F}_g$): The effectiveness of a
promotion can also be related to the urban business environment
in the proximity of the venue. The latter can be captured
through the spatial distribution of venues. For example, an
isolated restaurant might not benefit from a special deal
promotion, simply because people do not explore the specific
area for other attractions. For our analysis, we consider the
neighborhood $\mathcal{N}(v, r)$ of a venue $v$ to be the set of venues
within distance $r$ miles from $v$ (we use $r = 0.5$).

Density: We denote the number of neighboring venues
around $v$ as the density $\rho_v$ of $\mathcal{N}(v, r)$. Hence,

$$\rho_v = |\mathcal{N}(v, r)| \tag{6}$$

Area popularity: The density captures a static aspect
of $v$'s neighborhood. To capture the dynamic aspect of the
overall popularity of the area, we extract the total number of
check-ins observed in the neighborhood at time $t_{s-1}$:

$$\phi_v = \sum_{v' \in \mathcal{N}(v, r)} c_{v'}[t_{s-1}] \tag{7}$$

where $c_{v'}[t_{s-1}]$ denotes the number of check-ins at venue $v'$
at time $t_{s-1}$. Intuitively, a more popular area could imply a higher
likelihood for Foursquare users and potential customers to learn
about the promotion and be influenced to visit the venue.

Competitiveness: A venue $v$ of type $T(v)$ will compete
for customers only with neighboring venues of the
same type. Hence, we calculate the proportion of neighboring
venues that belong to the same type $T(v)$:

$$\kappa_v = \frac{|\{v' : v' \in \mathcal{N}(v, r) \wedge T(v') = T(v)\}|}{\rho_v} \tag{8}$$
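
The geographic features of this subsection (density, area popularity, competitiveness, and the neighborhood entropy introduced in the next paragraph) can be sketched in a few lines of Python. The venue dictionaries and their keys, the haversine distance, the exclusion of $v$ itself from its neighborhood, and the natural logarithm in the entropy are all assumptions of this sketch; the text does not fix any of them.

```python
import math
from collections import Counter

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def geographic_features(venue, others, r=0.5):
    """Density, popularity, competitiveness and entropy of venue's neighborhood.

    `venue` and each element of `others` are dicts with the hypothetical keys
    'lat', 'lon', 'type' and 'checkins' (check-ins at the reference time).
    """
    hood = [o for o in others
            if haversine_miles(venue["lat"], venue["lon"], o["lat"], o["lon"]) <= r]
    density = len(hood)
    popularity = sum(o["checkins"] for o in hood)
    if density == 0:  # isolated venue: ratios are undefined, report zeros
        return {"density": 0, "popularity": 0, "competitiveness": 0.0, "entropy": 0.0}
    same_type = sum(1 for o in hood if o["type"] == venue["type"])
    type_counts = Counter(o["type"] for o in hood)
    entropy = -sum((c / density) * math.log(c / density) for c in type_counts.values())
    return {"density": density, "popularity": popularity,
            "competitiveness": same_type / density, "entropy": entropy}


v = {"lat": 40.4406, "lon": -79.9959, "type": "Food", "checkins": 12}
neighbors = [
    {"lat": 40.4436, "lon": -79.9959, "type": "Food", "checkins": 10},
    {"lat": 40.4366, "lon": -79.9959, "type": "Food", "checkins": 5},
    {"lat": 40.4406, "lon": -79.9909, "type": "Nightlife", "checkins": 20},
    {"lat": 40.4436, "lon": -79.9929, "type": "Shops", "checkins": 7},
    {"lat": 40.4606, "lon": -79.9959, "type": "Food", "checkins": 100},  # > 0.5 miles away
]
features = geographic_features(v, neighbors)
```

In this toy example four of the five candidate venues fall within half a mile, two of them share the venue's type, and the distant venue contributes to none of the features.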

Neighborhood entropy: Apart from the business density
of the area around $v$, the diversity of the local venues might
be important as well. To capture diversity we rely on the
concept of information entropy. In our setting we calculate
the entropy of the distribution of the venue types in
$\mathcal{N}(v, r)$. With $f_T$ being the fraction of venues in
$\mathcal{N}(v, r)$ of type $T$, the entropy of the neighborhood
around $v$ is:

$$\varepsilon_v = -\sum_{T \in \mathcal{T}} f_T \log(f_T) \tag{9}$$

where $\mathcal{T}$ is the set of all (top-level) venue types.


4.2 Predictive power of individual features

We now examine the predictive ability of each of the numerical
features described above in isolation. We first compare
descriptive statistics of the distribution of each feature
(in particular the median) for the two classes. We then
compute the ROC curve for each feature considering a simple,
threshold-based, unsupervised classification system.

Mann-Whitney U test for each feature’s median: A
specific numerical feature X can be thought of as being
strongly discriminative for a classification problem if the
distributions of X for the positive and negative instances are
“significantly” different. To that end we examine the sample
median of these distributions by performing the two-sided
Mann-Whitney U test for the median values in the positive
and negative classes for each of the features. The p-values of
these tests are presented in Table 3.

ROC curves for individual features: We now compute
the ROC curve for each feature based on a simple unsupervised
classifier. The latter considers each feature X in
isolation and sets a threshold value for X that is used to
decide the class of every instance in our dataset. For each
value of this threshold we obtain a true-positive and a
false-positive rate. We further calculate the area under the ROC
curve (AUC). Interestingly, there is a connection between
the Mann-Whitney U test and the AUC given by (Cortes and
Mohri 2003):

$$AUC = \frac{U}{n_p \cdot n_n} \tag{10}$$

where $U$ is the value of the Mann-Whitney U test statistic,
$n_p$ is the number of positive instances and $n_n$ is the
number of negative instances. Table 3 presents the values
for AUC. As we observe, while there are some features that
deliver good performance (e.g., $N_s$), most of the
features give a performance close to the random baseline of
0.5. Hence, each feature individually does not appear to be
a good predictor for the effect of special offers through
LBSNs. However, in the following section we will examine a
supervised learning approach utilizing combinations of the
features.
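
The connection between the Mann-Whitney U statistic and the AUC of the threshold-based classifier can be checked numerically. The sketch below is illustrative only: the feature values are made up, ties are counted as one half in U, and the classifier is assumed to predict the positive class when the feature value is at or above the threshold.

```python
def mann_whitney_u(pos, neg):
    """U statistic of the positive sample: pairs with pos > neg, ties count 1/2."""
    u = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                u += 1.0
            elif p == n:
                u += 0.5
    return u


def threshold_roc_auc(pos, neg):
    """AUC of the rule 'positive if X >= threshold', via the trapezoid rule."""
    thresholds = sorted(set(pos) | set(neg), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every value: nothing is predicted positive
    for t in thresholds:
        tpr = sum(x >= t for x in pos) / len(pos)
        fpr = sum(x >= t for x in neg) / len(neg)
        points.append((fpr, tpr))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up values of one feature X for positive and negative instances.
x_pos = [3, 5, 5, 8]
x_neg = [1, 3, 4, 6]

auc_from_u = mann_whitney_u(x_pos, x_neg) / (len(x_pos) * len(x_neg))
auc_from_roc = threshold_roc_auc(x_pos, x_neg)
```

Both routes give the same number, so a feature whose U statistic is close to half of $n_p \cdot n_n$ necessarily has an AUC close to the random baseline, even when the test itself reports a small p-value on a large sample.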


4.3 Supervised learning classifiers 

In this section we turn our attention to supervised learning 
models and we combine the extracted features to improve 
the classification performance achieved by each one of them 
































Table 3: While the medians of the features for the two classes
are significantly different, the actual distributions do not
appear to be discriminative (low AUC).

Set  Feature          short-term        long-term
                      AUC    p-value    AUC    p-value
F_v  Cav[t_{s-1}]     0.537  10^-?      0.519  0.047
     ?                0.799  0          0.702  0
     ?[t_{s-1}]       0.526  ?          0.535  10^-?
     ?[t_{s-1}]       0.537  10^-?      0.557  0
     Nt?[t_{s-1}]     0.510  0.178      0.546  10^-?
F_p  D                0.539  10^-?      0.520  0
     $N_s$            0.617  0          0.609  0
F_g  $\rho_v$         0.551  0          0.551  10^-?
     $\phi_v$         0.558  0          0.558  10^-?
     $\kappa_v$       0.565  0          0.557  10^-?
     $\varepsilon_v$  0.559  0          0.574  0

(? marks a symbol, value or exponent that is illegible in the source.)


individually. We evaluate various combinations of the three
types of features, while our performance metrics include
accuracy, F-measure and AUC. Furthermore, we examine two
different models, a linear one (i.e., logistic regression) and
a more complex one based on ensemble learning (i.e., random
forest).

We begin by evaluating our models through 10-fold cross
validation on our labeled promotion dataset. The results for
the different combinations of features and for the different
classifiers are shown in Table 4. As the results indicate, even
when we use simple linear models the performance is
significantly improved compared to the unsupervised models. It
is also interesting to note that the most important type of
features appears to be the venue-based features $\mathcal{F}_v$. The
promotion-based as well as the geographic features, while
improving the classification performance when added, do
not provide very large improvements.
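
The evaluation protocol (10-fold cross validation scored with accuracy and F-measure) can be sketched in plain Python. This is a sketch under stated assumptions: the midpoint-threshold model below is a toy stand-in and not the logistic regression or random forest used in the study, the data are synthetic, and pooling the held-out predictions of all folds before scoring is only one of several common conventions.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def accuracy_and_f_measure(y_true, y_pred):
    """Accuracy and F-measure (harmonic mean of precision and recall), 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return accuracy, (2 * precision * recall / denom if denom else 0.0)


def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Train on k-1 folds, predict the held-out fold, pool and score once."""
    y_true, y_pred = [], []
    for held_out in k_fold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            y_true.append(y[i])
            y_pred.append(predict(model, X[i]))
    return accuracy_and_f_measure(y_true, y_pred)


# Toy stand-in model: threshold at the midpoint between the two class means.
def fit_midpoint(X_train, y_train):
    pos = [x for x, label in zip(X_train, y_train) if label == 1]
    neg = [x for x, label in zip(X_train, y_train) if label == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0


def predict_midpoint(threshold, x):
    return 1 if x >= threshold else 0


# Synthetic, well-separated single-feature data (20 negatives, 20 positives).
X = [float(i) for i in range(20)] + [float(i) for i in range(100, 120)]
y = [0] * 20 + [1] * 20
cv_accuracy, cv_f_measure = cross_validate(X, y, fit_midpoint, predict_midpoint)
```

Any classifier exposing a fit and a predict step can be plugged into `cross_validate` in place of the toy model.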

The above models were built and evaluated on the data
points identified through the bootstrap statistical tests in an
effort to keep the false positives/negatives of the labels low.
However, while this is important for building a robust model,
in a real-world application the model will need to output
predictions for cases that might not provide statistically
significant results a posteriori. After all, a venue owner is
interested in what they observe, and not in whether this was a
false positive/negative (i.e., an increase/decrease that
happened by chance). Hence, we test the performance of our
models on the data points in the promotion group for which
we were not able to identify a statistically significant change
($\alpha = 0.05$) in the average number of check-ins per day.
A positive observed value of $d$ corresponds to the positive
class. Note that we do not use these points for training. This
resembles an out-of-sample evaluation of our models, testing
their generalizability to less robust observations. Our
results are presented in Table 5. As one might have expected,
the performance is degraded compared to the cross-validation
setting, but the models that include the venue-based features
still perform clearly above the random baseline (e.g., a
short-term AUC of about 0.68).

Finally, we focus on the results from logistic regression,
which has a genuine probabilistic interpretation. In particular,
the accuracy when using the feature sets
$\mathcal{F}_v \cup \mathcal{F}_g$ and
$\mathcal{F}_p \cup \mathcal{F}_v \cup \mathcal{F}_g$ is very similar.
We compute the actual output of the model, i.e., before applying
the classification threshold, which is the probability of observing
an increase in the mean daily check-ins of the corresponding
venue. Hence, the outputs of the two models provide
the probabilities $P(I \mid \mathcal{F}_v, \mathcal{F}_g)$ and
$P(I \mid \mathcal{F}_v, \mathcal{F}_g, \mathcal{F}_p)$ respectively.
We calculate the difference between these probabilities
for all the corresponding cases in Tables 4 and 5, and Table 6
presents the root mean square differences, which are small
for all the scenarios. Since the features in $\mathcal{F}_v$ and
$\mathcal{F}_g$ capture various (environmental) externalities, while
the set $\mathcal{F}_p$ captures attributes related to the promotion
itself, these results further support the findings of our
statistical analysis. Of course, these features do not capture
all the externalities, and thus the actual probabilities might
differ, even though the classification outcome is very accurate.
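
The root mean square difference between the outputs of the two logistic regression models can be computed as in the following sketch. The probability values are invented for illustration and do not come from the study; the function name is likewise hypothetical.

```python
import math


def rms_difference(p_a, p_b):
    """Root mean square difference between two equally long probability lists."""
    if len(p_a) != len(p_b) or not p_a:
        raise ValueError("need two non-empty lists of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p_a, p_b)) / len(p_a))


# Invented predicted probabilities of an increase for the same four venues:
# one model without and one model with the promotion-based features.
p_without_promotion = [0.62, 0.35, 0.80, 0.51]
p_with_promotion = [0.60, 0.40, 0.78, 0.55]

rms = rms_difference(p_without_promotion, p_with_promotion)
```

A small value means that adding the promotion-based features barely moves the predicted probability of an increase for a typical venue.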

Table 6: The root mean square distance between the logistic
regression outputs for the feature sets F_v ∪ F_g and
F_p ∪ F_v ∪ F_g further supports our statistical analysis.

                  short-term  long-term
Cross-validation  0.081       0.067
Out-of-sample     0.072       0.074


5 Related Work 

Effects of Promotions: There are studies in management
science that examine the impact of promotions
on marketing. For example, (Blattberg, Briesch, and Fox
1995) found that temporary discounting substantially
increases short-term brand sales. However, its long-term
effects tend to be much weaker. This pattern was further
quantified by (Pauwels, Hanssens, and Siddarth 2002), who found
that the significant short-term promotion effects on customer
purchases die out in subsequent weeks or months.
Furthermore, Srinivasan et al. (Srinivasan et al. 2004) quantified the
price promotion impact on two targeted variables, namely,
revenues and total profits, by using vector autoregressive
modeling. The authors found that price promotion has
a positive impact on manufacturer revenues, but for retailers
it depends on multiple factors such as brand and promotion
frequency. Finally, (Kopalle, Mela, and Marsh 1999)
proposed a descriptive dynamic model which suggests that
higher-share brands tend to over-promote (i.e., offer
promotions very frequently), while lower-share brands do not
promote frequently enough.

Online Deals and Advertising: Online promotions have
gained a lot of attention in recent literature. Such promotions
have been a popular strategy for local merchants to increase
revenues and/or raise the awareness of potential customers.
A detailed business model analysis on Groupon was first
presented by (Arabshahi 2010), while in (Dholakia 2010)
the author surveyed businesses that provide Groupon deals
to determine their satisfaction. Edelman et al. (Edelman,
Jaffe, and Kominers 2011) considered the benefits and
drawbacks from a merchant’s point of view on using Groupon
and provided a model that captures the interplay between
advertising and price discrimination effects and the potential














































Table 4: Using supervised learning models improves the performance over unsupervised learning methods.

Algorithm            Feature set       short-term                    long-term
                                       Accuracy  F-measure  AUC      Accuracy  F-measure  AUC
Logistic Regression  F_p               0.582     0.474      0.583    0.684     0.139      0.642
                     F_v               0.831     0.836      0.882    0.826     0.68       0.876
                     F_g               0.569     0.532      0.579    0.686     0.029      0.582
                     F_p ∪ F_v         0.833     0.835      0.885    0.831     0.697      0.876
                     F_p ∪ F_g         0.588     0.52       0.618    0.684     0.128      0.641
                     F_v ∪ F_g         0.83      0.835      0.882    0.827     0.687      0.876
                     F_p ∪ F_v ∪ F_g   0.834     0.836      0.885    0.833     0.704      0.876
Random Forest        F_p               0.681     0.672      0.76     0.685     0.349      0.702
                     F_v               0.856     0.846      0.931    0.86      0.761      0.9
                     F_g               0.559     0.523      0.578    0.646     0.285      0.576
                     F_p ∪ F_v         0.87      0.862      0.943    0.868     0.777      0.909
                     F_p ∪ F_g         0.666     0.652      0.74     0.685     0.396      0.697
                     F_v ∪ F_g         0.856     0.846      0.934    0.862     0.765      0.904
                     F_p ∪ F_v ∪ F_g   0.87      0.861      0.94     0.863     0.765      0.91


Table 5: Our supervised models deliver good performance on out-of-sample evaluation on the less robust observations.

Algorithm            Feature set       short-term                    long-term
                                       Accuracy  F-measure  AUC      Accuracy  F-measure  AUC
Logistic Regression  F_p               0.484     0.353      0.486    0.532     0.21       0.543
                     F_v               0.625     0.696      0.678    0.589     0.436      0.659
                     F_g               0.497     0.442      0.5      0.518     0.011      0.527
                     F_p ∪ F_v         0.628     0.683      0.654    0.596     0.509      0.651
                     F_p ∪ F_g         0.491     0.389      0.494    0.524     0.133      0.553
                     F_v ∪ F_g         0.625     0.695      0.677    0.592     0.459      0.657
                     F_p ∪ F_v ∪ F_g   0.627     0.683      0.656    0.6       0.521      0.653
Random Forest        F_p               0.532     0.552      0.54     0.52      0.338      0.56
                     F_v               0.641     0.681      0.676    0.608     0.628      0.628
                     F_g               0.503     0.469      0.508    0.522     0.283      0.513
                     F_p ∪ F_v         0.639     0.681      0.676    0.611     0.63       0.631
                     F_p ∪ F_g         0.525     0.541      0.539    0.526     0.356      0.547
                     F_v ∪ F_g         0.643     0.682      0.678    0.61      0.627      0.63
                     F_p ∪ F_v ∪ F_g   0.643     0.682      0.677    0.612     0.628      0.631


benefits to merchants. Finally, Byers et al. (Byers,
Mitzenmacher, and Zervas 2012) designed a predictive model for
the Groupon deal size by combining features of the offer
with information drawn from social media. They further
examined the effect of Groupon deals on Yelp rating scores.


Tangential to our work is also the literature on web
advertising and its efficiency. In this space, Fulgoni et al.
(Fulgoni and Morn 2008) present data for the positive impact of
online display advertising on search lift and sales lift, while
Goldfarb et al. (Goldfarb and Tucker 2011) further
examined the effect of different properties of display
advertising on its success through traditional user surveys.
Papadimitriou et al. (Papadimitriou et al. 2011) study the impact of
online display advertising on user search behavior using a
controlled experiment.

Mobile Marketing and Social Media: Mobile marketing
serves as a promising strategy for retail businesses to
attract, maintain and enhance the connection with their
customers. Sliwinski (Sliwinski 2002) built a prototype
application that utilizes customer spatial point pattern analysis
to target potential new customers. Furthermore, Banerjee et
al. (Banerjee and Dholakia 2008) studied the effectiveness
of mobile advertising. Their findings indicate that the actual
location of the participant, as well as the context of that
location, significantly influences the potential effectiveness of
these advertising strategies. Recently, there have also been
efforts to quantify through models (Baccelli and Bolot 2011)
the financial value of location data, which are at the center
of mobile marketing operations.


In another direction, location-based social media have
gained a lot of attention. Data collected from such platforms
can drive novel business analysis. Qu and Zhang (Qu and
Zhang 2013) proposed a framework that extends traditional
trade area analysis and incorporates location data of mobile
users. As another example, Karamshuk et al. (Karamshuk et
al. 2013) proposed a machine learning framework to predict
the optimal placement for retail stores, where they extracted
two types of features from a Foursquare check-in dataset.
Furthermore, these platforms can serve as mobile “yellow
pages” with business reviews that can influence customer
choices. For example, Luca (Luca 2011) has identified a
causal impact of Yelp ratings on restaurant demand using
the regression discontinuity framework.




























































6 Discussion and Limitations 

We would like to reiterate that this study should not be seen
as a study on Foursquare per se. Our work is focused on
the mechanism of promotions through location-based social
media. Our results suggest that the benefits from local
promotions through LBSNs are more limited than what
anecdotal stories suggest. However, we acknowledge again that
the time-series of daily check-ins is only a proxy for the
actual revenue generated. Nevertheless, we believe that even
if the specific check-ins do not lead to direct revenue, they
increase the visibility of the venue, at least within the
ecosystem of social media. Note here that even though the potential
of special campaigns through geo-social media appears to be
limited, even a small increase in the probabilities
$P(I_d \mid S, E)$ and $P(I_a \mid S, E)$ as compared to
$P(I_d \mid E)$ and $P(I_a \mid E)$ respectively can still be
deemed a successful advertising model, given that typical
conversion rates of online advertisements can be as small as 1%
(Kazienko and Adamski 2007). Our analysis can also shed light
on possible tweaks to the way promotions are offered. For example,
recent literature (Cramer, Rost, and Holmquist 2011) has brought
to the surface possible reasons that lead people to check in to a
location long after they arrive. This means that these users
might not even have used the social application to explore
the area they are in and thus they were not aware at
all of a special deal that was in the vicinity. Therefore,
more active communication channels for these campaigns
might be required (e.g., geo-fenced push notifications). The
way that a promotion is redeemed might also play a role.
For example, some deals require users to have an American
Express card. Furthermore, venues might combine their
online promotions with other offline campaigns that can further
improve the effectiveness of both advertising means.
Unfortunately, our analysis cannot account for this due to the lack
of appropriate information.

From a technical point of view, we have used bootstrap
techniques for our hypothesis tests in order to avoid the strong
assumptions of standardized tests. Nevertheless, the bootstrap
relies on the assumption that the obtained sample is
representative of the population. In our case the
representativeness of the sample might be challenged by its possibly
small size (e.g., for promotion periods that last only one
week). Furthermore, the interpolation performed on the raw
time-series might have added noise to the empirical
bootstrap distribution obtained. However, we expect both of
the distributions under the null and alternative hypotheses to
have been affected in a similar manner, if at all, and hence
their relative positions not to have been affected. In
addition, while we have performed block bootstrap resampling
with a block size of 2 in order to account for dependencies
between check-ins of consecutive days, the dependencies
might be more complicated. Finally, the quality of the
reference group can be significantly affected by the venues that
have been created on the social media platform. While we
have accounted for this, identifying spam venues is beyond
the scope of this work. Moreover, if in addition to the type
and location of a venue there are other observed or
unobserved confounding factors that affect the decision to offer a
promotion, our reference groups might not be able to
effectively account for self-selection biases.
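
Block bootstrap resampling with a block size of 2 can be sketched as follows. The moving-block variant (overlapping blocks drawn uniformly at random) is an assumption of this sketch, since the text only states the block size; the daily check-in values are made up.

```python
import random


def moving_block_bootstrap(series, block_size=2, rng=None):
    """Resample a series by concatenating randomly chosen consecutive blocks."""
    rng = rng or random.Random()
    n = len(series)
    if n < block_size:
        raise ValueError("series is shorter than one block")
    starts = range(n - block_size + 1)  # every admissible block start
    sample = []
    while len(sample) < n:
        s = rng.choice(starts)
        sample.extend(series[s:s + block_size])
    return sample[:n]  # trim the last block if it overshoots


def bootstrap_means(series, n_resamples=1000, block_size=2, seed=0):
    """Empirical bootstrap distribution of the mean daily check-ins."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_resamples):
        resample = moving_block_bootstrap(series, block_size, rng)
        means.append(sum(resample) / len(resample))
    return means


# Made-up daily check-ins for a one-week promotion period.
series = [12, 15, 9, 20, 18, 22, 17]
sample = moving_block_bootstrap(series, block_size=2, rng=random.Random(1))
means = bootstrap_means(series, n_resamples=200)
```

Each resampled pair of days is a pair of consecutive days in the original series, so the day-to-day dependence is preserved within blocks but not across block boundaries, which is the limitation noted above.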


7 Conclusions and Future Work 

We study the effectiveness of special deals that local
establishments can offer through LBSNs. We collect and analyze
a large dataset from Foursquare using randomization and
statistical bootstrap. We find that promotions through
LBSNs do not alter the probability of observing an increase in
the daily check-ins to a venue, while the underlying
standardized effect size changes only slightly. We also model
the effectiveness of such offers by extracting three
different types of features and building classifiers that can provide
an educated decision with regard to the success of
these promotions. In the future, we plan to incorporate into
our analysis the lower-level categories for the locales, as well
as examine alternative evaluation metrics (e.g., number of
unique users). Finally, we plan to explore ways to study a
recently introduced mechanism for advertisements through
LBSNs (Fsq-ads 2014), which appears to be effective based
again on anecdotes.